Real-valued All-Dimensions Search: Low-overhead Rapid Searching over Subsets of Attributes

Authors

  • Andrew W. Moore
  • Jeff G. Schneider
Abstract

This paper is about searching the combinatorial space of contingency tables during the inner loop of a nonlinear statistical optimization. Examples of this operation in various data analytic communities include searching for nonlinear combinations of attributes that contribute significantly to a regression (Statistics), searching for items to include in a decision list (machine learning) and association rule hunting (Data Mining). This paper investigates a new, efficient approach to this class of problems, called RADSEARCH (Real-valued All-Dimensions-tree Search). RADSEARCH finds the global optimum, and this gives us the opportunity to empirically evaluate the question: apart from algorithmic elegance what does this attention to optimality buy us? We compare RADSEARCH with other recent successful search algorithms such as CN2, PRIM, APriori, OPUS and DenseMiner. Finally, we introduce RADREG, a new regression algorithm for learning real-valued outputs based on RADSEARCHing for high-order interactions.

Similar resources

FUZZY HYPERVECTOR SPACES OVER VALUED FIELDS

In this note we first redefine the notion of a fuzzy hypervector space (see [1]) and then introduce some further concepts of fuzzy hypervector spaces, such as fuzzy convex and balanced fuzzy subsets in fuzzy hypervector spaces over valued fields. Finally, we briefly discuss the convex (balanced) hull of a given fuzzy set of a hypervector space.


Detecting Anomalous Groups in Categorical Datasets

We propose a new method for detecting groups of anomalies in categorical datasets. Our approach is a generalization of the spatial scan statistic, a commonly used method for detecting clusters of increased counts in spatial data. We extend this framework to non-spatial datasets with discrete valued attributes, where the degree of anomalousness of each record depends on its attribute values and ...


An Evolutionary Algorithm Integrating Discretization of Continuous-valued Attributes with Learning Decision Rules

A new method of learning decision rules from databases, which uses an evolutionary algorithm, is proposed. The main difference between our approach and the others described in the literature is the way continuous-valued attributes are processed. Most decision rule learners process these attributes separately when searching for threshold values, which may decrease the performance. In contrast ...


Real-Valued Schemata Search Using Statistical Confidence

Neural Network & Machine Learning Laboratory, Computer Science Department, Brigham Young University, Provo, UT 84602, USA. WWW: http://axon.cs.byu.edu. Abstract: Many neural network models must be trained by finding a set of real-valued weights that yield high accuracy on a training set. Other learning models require weights on input attributes that yield high leave-one...


An Efficient Scheme for Real-time Information Storage and Retrieval Systems: A Hybrid Approach

Information storage and retrieval is a fundamental requirement for many real-time applications. These systems demand that data be kept sorted at all times, that real-time insertion, deletion and searching be supported, and that dynamic entries be supported. These systems require search operations to be performed on massive databases implemented by various data structures. The common da...



Publication date: 2002